
    Pupillary dilation response reflects surprising moments in music

    There are indications that the pupillary dilation response (PDR) reflects surprising moments in an auditory sequence, such as the appearance of a deviant noise against repetitively presented pure tones (Liao, Yoneya, Kidani, Kashino, & Furukawa, 2016), and sounds that human participants subjectively evaluate as salient and loud (Liao, Kidani, Yoneya, Kashino, & Furukawa, 2016). In the current study, we further examined whether the PDR to auditory surprise also emerges in complex yet structured auditory stimuli, i.e., music, when surprise is defined subjectively. Participants listened to 15 excerpts of music while their pupillary responses were recorded. In the surprise-rating session, participants rated how surprising each instance in an excerpt was, i.e., rich in variation versus monotonous, while listening to it. In the passive-listening session, they listened to the same 15 excerpts again but performed no task. The pupil-diameter data from both sessions were time-aligned to the rating data from the surprise-rating session. In both sessions, mean pupil diameter was larger at moments rated more surprising than at unsurprising moments. The result suggests that the PDR reflects surprise in music automatically.
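
    As an illustration of the core analysis, here is a minimal Python sketch (not the authors' code; function and variable names are hypothetical) of time-aligning a pupil trace to continuous surprise ratings and comparing mean pupil diameter between moments rated surprising and unsurprising:

    ```python
    import numpy as np

    def compare_pupil_by_surprise(pupil, pupil_t, rating, rating_t, threshold):
        """Mean pupil diameter at surprising vs. unsurprising moments.

        pupil, pupil_t   : pupil-diameter samples and their timestamps (s)
        rating, rating_t : continuous surprise ratings and their timestamps (s)
        threshold        : rating above which a moment counts as surprising
        """
        # Resample the ratings onto the pupil timeline so both streams align.
        rating_on_pupil_time = np.interp(pupil_t, rating_t, rating)
        surprising = rating_on_pupil_time > threshold
        return pupil[surprising].mean(), pupil[~surprising].mean()
    ```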

    Pose Estimation for Human Wearing Loose-Fitting Clothes: Obtaining Ground Truth Posture Using HFR Camera and Blinking LEDs

    Human pose estimation, particularly for athletes, can help improve performance. However, such estimation is difficult with existing methods, including human annotation, when the subjects wear loose-fitting clothes such as ski/snowboard wear. This study developed a method for obtaining ground-truth data on the two-dimensional (2D) pose of a human wearing loose-fitting clothes. The method uses fast-flashing light-emitting diodes (LEDs). The subjects wear loose-fitting clothes and place LEDs on the target joints; by choosing thin, filmy loose-fitting clothes, the LEDs can be observed directly by a camera. The proposed method captures the scene at 240 fps with a high-frame-rate camera and renders two 30 fps image sequences by extracting the LED-on and LED-off frames. The temporal difference between the two video sequences is negligible considering the speed of human motion. The LED-on video was used to manually annotate the joints and thus obtain the ground-truth data. The LED-off video, equivalent to a standard 30 fps video, was used to assess the accuracy of existing machine-learning-based methods and manual annotations. Experiments demonstrated that the proposed method can obtain ground-truth data for standard RGB videos, and revealed that neither manual annotation nor a state-of-the-art pose estimator recovers the correct positions of the target joints.
    Comment: Extended abstract for the WACV2023 workshop on Computer Vision 4 Winter Sport.
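
    The frame-demultiplexing step lends itself to a short sketch. The following Python is a hypothetical reconstruction (not the authors' code): it splits a 240 fps grayscale capture into 30 fps LED-on and LED-off sequences by picking, within each group of eight frames, the brightest and dimmest frame. Brightness is only one plausible selection heuristic; the actual system may instead synchronize with the LED drive signal.

    ```python
    import numpy as np

    def split_led_streams(frames, group=8):
        """Demultiplex a 240 fps capture into 30 fps LED-on/LED-off videos.

        frames : grayscale frames, shape (n, h, w), captured at 240 fps
        group  : input frames per output sample (240 / 30 = 8)

        Within each group, the frame with the highest overall brightness is
        taken as LED-on and the lowest as LED-off; the temporal offset
        between the two picks (< 1/30 s) is negligible for human motion.
        """
        on_video, off_video = [], []
        for i in range(0, len(frames) - group + 1, group):
            chunk = frames[i:i + group]
            brightness = chunk.reshape(group, -1).mean(axis=1)
            on_video.append(chunk[brightness.argmax()])
            off_video.append(chunk[brightness.argmin()])
        return np.stack(on_video), np.stack(off_video)
    ```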

    Inhibition-excitation balance in the parietal cortex modulates volitional control for auditory and visual multistability

    Perceptual organisation must select one interpretation from several alternatives to guide behaviour. Computational models suggest that this could be achieved through an interplay between inhibition and excitation across the competing neural populations coding for each interpretation. Here, to test such models, we used magnetic resonance spectroscopy to measure non-invasively the concentrations of inhibitory γ-aminobutyric acid (GABA) and excitatory glutamate-glutamine (Glx) in several brain regions. Human participants first performed auditory and visual multistability tasks that produced spontaneous switching between percepts. We observed that longer percept durations were associated with higher GABA/Glx ratios in the sensory area coding for each modality. When participants were asked to voluntarily modulate their perception, a common factor across modalities emerged: the GABA/Glx ratio in the posterior parietal cortex tended to be positively correlated with the amount of effective volitional control. Our results provide direct evidence that the balance between neural inhibition and excitation within sensory regions resolves perceptual competition. This powerful computational principle appears to be leveraged by both audition and vision, implemented independently across modalities but modulated by an integrated control process.

    Perceptual multistability describes an intriguing situation whereby an observer reports random changes in conscious perception for a physically unchanging stimulus [1,2]. Multistability is a powerful tool with which to probe perceptual organisation, as it highlights perhaps the most fundamental issue faced by perception for any reasonably complex natural scene: because the information encoded by sensory receptors is never sufficient to fully specify the state of the outside world [3], at each instant perception must choose between a number of competing alternatives. In realistic situations, this process produces a stable and useful representation of the world; with intrinsically ambiguous information, the same process is revealed as multistable perception. A number of theoretical models have converged to pinpoint the generic computational principles likely to be required to explain multistability, and hence perceptual organisation [4-9]. All of these models consider three core ingredients: inhibition between competing neural populations, adaptation within these populations, and neuronal noise. The precise role of each ingredient and their respective importance is still debated: noise is introduced to induce fluctuations in each population and initiate stochastic perceptual switching in some models [7-9], whereas switching dynamics are determined solely by inhibition in others [5,6]. Functional brain imaging in humans has provided results qualitatively compatible with these computational principles at several levels of the visual processing hierarchy [10]. But for most functional imaging techniques in humans, such as fMRI or MEG/EEG, change …
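
    The key individual-differences analysis reduces to correlating a metabolite ratio with behaviour. A minimal Python sketch (hypothetical names; the paper's exact statistics are not reproduced here):

    ```python
    import numpy as np
    from scipy.stats import pearsonr

    def gaba_glx_vs_duration(gaba, glx, percept_durations):
        """Correlate per-participant GABA/Glx ratios with percept duration.

        gaba, glx         : metabolite concentrations, one per participant
        percept_durations : per-participant arrays of percept durations (s)
        """
        ratio = np.asarray(gaba, dtype=float) / np.asarray(glx, dtype=float)
        mean_duration = np.array([np.mean(d) for d in percept_durations])
        return pearsonr(ratio, mean_duration)  # (r, p)
    ```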

    INTER-TRIAL DIFFERENCE ANALYSIS THROUGH APPEARANCE-BASED MOTION TRACKING

    The purpose of this study is to develop a method for the quantitative evaluation and visualization of inter-trial differences in the motion of athletes. Previous methods for kinematic analyses of human movement have required attaching dedicated equipment to body segments or can only be used in environments designed for analysis, making them difficult to apply to motions in real games. To enhance applicability to real-game situations, we propose appearance-based motion tracking, which requires only an image sequence from a camera. From the image sequence, trials are detected automatically and a difference analysis is conducted on them, as sketched below. We applied our method to the analysis of pitching motions in actual baseball games. Although we have not yet performed a quantitative evaluation, the experimental results suggest the efficacy of our method.
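
    A minimal Python sketch of the two steps, assuming a grayscale image sequence (an illustrative reconstruction, not the authors' implementation): trials are segmented where frame-difference energy exceeds a threshold, and two detected trials are then compared frame by frame:

    ```python
    import numpy as np

    def detect_trials(frames, threshold):
        """Segment trials as contiguous runs of high frame-difference energy.

        frames : grayscale image sequence, shape (n, h, w)
        Returns a list of (start, end) frame-index pairs.
        """
        energy = np.abs(np.diff(frames.astype(float), axis=0)).mean(axis=(1, 2))
        active = np.concatenate(([False], energy > threshold, [False]))
        edges = np.flatnonzero(np.diff(active.astype(int)))
        return list(zip(edges[0::2], edges[1::2]))

    def trial_difference(frames, trial_a, trial_b):
        """Mean per-frame appearance difference between two detected trials."""
        a = frames[trial_a[0]:trial_a[1]].astype(float)
        b = frames[trial_b[0]:trial_b[1]].astype(float)
        n = min(len(a), len(b))        # truncate to the shorter trial
        return np.abs(a[:n] - b[:n]).mean(axis=(1, 2))
    ```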

    Perceptual Restoration of Temporally Distorted Speech in L1 vs. L2: Local Time Reversal and Modulation Filtering

    Speech is intelligible even when its temporal envelope is distorted. The current study investigates how native and non-native speakers perceptually restore temporally distorted speech. Participants were native English speakers (NS) and native Japanese speakers who spoke English as a second language (NNS). In Experiment 1, participants listened to “locally time-reversed speech,” in which every x-ms segment of the speech signal was reversed on the temporal axis. Here, local time reversal shifted the constituents of the speech signal forward or backward from their original positions, and the amplitude envelope of speech was altered as a function of reversed-segment length. In Experiment 2, participants listened to “modulation-filtered speech,” in which the modulation frequency components of speech were low-pass filtered at a particular cut-off frequency. Here, the temporal envelope of speech was altered as a function of cut-off frequency. The results show that speech becomes gradually unintelligible as the length of reversed segments increases (Experiment 1) and as a lower cut-off frequency is imposed (Experiment 2). Native and non-native speakers each exhibited equivalent levels of speech intelligibility across the six levels of degradation in the two experiments, which raises the question of whether the regular occurrence of local time reversal can be treated in the modulation frequency domain by simply converting the length of reversed segments (ms) into frequency (Hz).
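
    Both stimulus manipulations are easy to sketch. Below is an illustrative Python reconstruction (not the authors' stimulus code): locally_time_reverse flips every x-ms segment of the waveform (Experiment 1), and modulation_lowpass gives a simplified single-band stand-in for modulation filtering (Experiment 2), low-passing the Hilbert envelope and reimposing it on the temporal fine structure; the actual stimuli were presumably built with a multi-band modulation filterbank.

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def locally_time_reverse(signal, fs, segment_ms):
        """Reverse every segment_ms-long chunk of the waveform."""
        seg = max(1, int(fs * segment_ms / 1000))
        out = signal.copy()
        for i in range(0, len(out), seg):
            out[i:i + seg] = out[i:i + seg][::-1]
        return out

    def modulation_lowpass(signal, fs, cutoff_hz):
        """Single-band approximation of modulation filtering: low-pass the
        amplitude envelope at cutoff_hz, keep the temporal fine structure."""
        analytic = hilbert(signal)
        envelope = np.abs(analytic)
        fine = np.cos(np.angle(analytic))
        b, a = butter(4, cutoff_hz / (fs / 2), btype="low")
        smoothed = np.maximum(filtfilt(b, a, envelope), 0.0)
        return smoothed * fine
    ```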

    Rapid Change in Articulatory Lip Movement Induced by Preceding Auditory Feedback during Production of Bilabial Plosives

    BACKGROUND: There is ample evidence of kinesthetically induced rapid compensation for unanticipated perturbations of speech articulatory movements. However, the role of auditory information in stabilizing articulation has been little studied, except for the control of voice fundamental frequency, voice amplitude, and vowel formant frequencies. Although the influence of auditory information on the articulatory control process is evident in unintended speech errors caused by delayed auditory feedback, the direct and immediate effect of auditory alteration on the movements of the articulators has not been clarified. METHODOLOGY/PRINCIPAL FINDINGS: This work examined whether temporal changes in the auditory feedback of bilabial plosives immediately affect the subsequent lip movement. We conducted experiments with an auditory feedback alteration system that enabled us to replace or block speech sounds in real time. Participants were asked to produce the syllable /pa/ repeatedly at a constant rate. During the repetition, normal auditory feedback was interrupted, and one of three pre-recorded syllables, /pa/, /Φa/, or /pi/, spoken by the same participant, was presented once at a timing different from the anticipated production onset, while no feedback was presented for subsequent repetitions. Comparison of the labial-distance trajectories under the altered and normal feedback conditions indicated that the movement quickened during the short period immediately after the alteration onset when /pa/ was presented 50 ms before the expected timing; no significant change was found under the other feedback conditions tested. CONCLUSIONS/SIGNIFICANCE: The earlier articulation rapidly induced by the temporally advanced auditory input suggests that a compensatory mechanism helps maintain a constant speech rate by detecting errors between the internally predicted and the actually provided auditory information associated with self-movement. The timing- and context-dependent effects of feedback alteration suggest that this sensory error detection operates within a temporally asymmetric window in which acoustic features of the syllable to be produced may be coded.
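
    The timing manipulation can be illustrated offline. This Python sketch is hypothetical (the actual system operated in real time on live microphone input): it builds the altered feedback for one trial by interrupting normal feedback at the expected production onset and splicing in a pre-recorded syllable with a timing shift, e.g. -50 ms for the condition that produced the effect.

    ```python
    import numpy as np

    def build_altered_feedback(mic, fs, expected_onset_s, replacement, shift_ms):
        """Mute feedback from the expected onset and splice in a pre-recorded
        syllable shifted by shift_ms relative to that onset.

        mic         : the participant's own speech signal for the trial
        replacement : pre-recorded /pa/, /Φa/, or /pi/ from the same speaker
        shift_ms    : negative = earlier than the anticipated onset
        """
        out = mic.copy()
        onset = int(expected_onset_s * fs)
        out[onset:] = 0.0                      # interrupt normal feedback
        start = max(0, onset + int(fs * shift_ms / 1000))
        end = min(len(out), start + len(replacement))
        out[start:end] = replacement[:end - start]  # present substitute once
        return out
    ```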

    Imitation learning from unsegmented human motion based on N-gram statistics of linear prediction models


    Auditory Feedback Assists Post hoc Error Correction of Temporal Reproduction, and Perception of Self-Produced Time Intervals in Subsecond Range

    We examined whether auditory feedback assists the post hoc error correction of temporal reproduction and the perception of self-produced time intervals in the subsecond and suprasecond ranges. We employed a temporal reproduction task with a single motor response at a point in time, with and without auditory feedback; this task limits participants to reducing errors by employing auditory feedback in a post hoc manner. Additionally, the participants were asked to judge their self-produced timing in this task. The results showed that, in the presence of auditory feedback, participants exhibited smaller variability and bias in temporal reproduction and in the perception of self-produced time intervals in the subsecond range, but not in the suprasecond range. Furthermore, in the presence of auditory feedback, the positive serial dependency of temporal reproduction, i.e., the tendency of reproduced intervals to resemble those in adjacent trials, was reduced in the subsecond range but not in the suprasecond range. These results suggest that auditory feedback assists the post hoc error correction of temporal reproduction, and the perception of self-produced time intervals, in the subsecond range.
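
    The dependent measures are simple to state in code. A minimal Python sketch (hypothetical names, not the authors' analysis scripts) of the bias, variability, and lag-1 serial dependence of a sequence of reproduced intervals:

    ```python
    import numpy as np
    from scipy.stats import pearsonr

    def reproduction_stats(reproduced, target):
        """Bias (mean signed error) and variability (SD) of reproductions."""
        x = np.asarray(reproduced, dtype=float)
        return x.mean() - target, x.std(ddof=1)

    def lag1_serial_dependence(reproduced):
        """Positive serial dependency: correlation of trial n with trial n-1."""
        x = np.asarray(reproduced, dtype=float)
        return pearsonr(x[:-1], x[1:])  # (r, p)
    ```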

    Different roles of COMT and HTR2A genotypes in working memory subprocesses

    Working memory is linked to the functions of the frontal areas, in which neural activity is mediated by dopaminergic and serotonergic tone. However, there is no consensus regarding how the dopaminergic and serotonergic systems influence working memory subprocesses. The present study used an imaging genetics approach to examine the interaction between neurochemical function and working memory performance. We focused on functional polymorphisms of the catechol-O-methyltransferase (COMT) Val158Met and serotonin 2A receptor (HTR2A) -1438G/A genes, and devised a delayed recognition task to isolate the encoding, retention, and retrieval processes for visual information. The COMT genotypes affected recognition accuracy, whereas the HTR2A genotypes were associated with recognition response times. Activations specifically related to working memory were found in the right frontal and parietal areas, including the middle frontal gyrus (MFG), inferior frontal gyrus (IFG), anterior cingulate cortex (ACC), and inferior parietal lobule (IPL). MFG and ACC/IPL activations were sensitive to differences between the COMT genotypes and between the HTR2A genotypes, respectively. Structural equation modeling demonstrated that stronger connectivity in the ACC-MFG and ACC-IFG networks is related to better task performance. The behavioral and fMRI results suggest that the dopaminergic and serotonergic systems play different roles in working memory subprocesses and modulate closer cooperation between lateral and medial frontal activations.
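
    As a generic illustration of the behavioural genotype comparison (the paper's actual statistical models are not specified here; the group labels and the choice of a one-way test are assumptions), comparing a performance measure across genotype groups might look like:

    ```python
    import numpy as np
    from scipy.stats import f_oneway

    def genotype_effect(scores_by_genotype):
        """One-way test of a behavioural measure across genotype groups,
        e.g. recognition accuracy for COMT Val/Val vs. Val/Met vs. Met/Met.

        scores_by_genotype : dict mapping genotype label -> array of scores
        """
        groups = [np.asarray(v, dtype=float)
                  for v in scores_by_genotype.values()]
        return f_oneway(*groups)  # (F, p)
    ```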